feat(vision): add optional OmniParser HTTP provider (#1053) #1107
Constraint: OmniParser must remain an external, explicitly configured HTTP service so core OpenChrome stays dependency-light.
Rejected: Add Python/Torch/Docker or make OmniParser the default provider | violates the adapter-only issue boundary and would risk harness stability.
Confidence: high
Scope-risk: moderate
Directive: Keep external visual providers timeout-bounded, warning-backed, and safe to fall back to DOM snapshots.
Tested: npm test -- --runInBand tests/vision/omniparser-http-provider.test.ts; npm test -- --runInBand tests/vision/perception-snapshot.test.ts tests/vision/omniparser-http-provider.test.ts; npm run build; npm run lint:tier; npm run lint:changed; git diff --check
Not-tested: Full npm test not rerun because the preceding PR branch already exposed unrelated tests/tools/tabs.test.ts failures; npm run lint still has unrelated baseline errors in src/session-manager.ts and src/tools/connect.ts; live OpenChrome MCP mock-server verification not run in this worktree.
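The directive above (timeout-bounded, warning-backed, safe fallback to DOM snapshots) could be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Snapshot` shape, `SnapshotSource`, and `snapshotWithFallback` are hypothetical names, and the real `PerceptionSnapshot` contract is defined in #1093.

```typescript
// Hypothetical, simplified snapshot shape (assumption; see #1093 for the real contract).
interface Snapshot {
  provider: string;
  elements: unknown[];
  warnings: string[];
}

// An external provider call that honours an abort signal (assumption).
type SnapshotSource = (signal: AbortSignal) => Promise<Snapshot>;

async function snapshotWithFallback(
  external: SnapshotSource,
  domFallback: () => Promise<Snapshot>,
  timeoutMs: number,
): Promise<Snapshot> {
  const controller = new AbortController();
  // Rejects as soon as the controller is aborted by the timer below.
  const timedOut = new Promise<never>((_, reject) => {
    controller.signal.addEventListener('abort', () =>
      reject(new Error(`external provider timed out after ${timeoutMs} ms`)),
    );
  });
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Whichever settles first wins: the provider response or the timeout.
    return await Promise.race([external(controller.signal), timedOut]);
  } catch (err) {
    // Any failure degrades to the DOM snapshot and records a warning.
    const reason = err instanceof Error ? err.message : String(err);
    const fallback = await domFallback();
    return {
      ...fallback,
      warnings: [...fallback.warnings, `external provider failed, used DOM snapshot: ${reason}`],
    };
  } finally {
    clearTimeout(timer);
  }
}
```

The point of the sketch is that a slow or failing external service never blocks `vision_find`: the caller always gets a snapshot, and the warning explains when the DOM path was used instead.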
Resolve the stacked vision_find conflict so the optional OmniParser HTTP snapshot provider coexists with occlusion, iframe, and tiled DOM capture options inherited from the perception snapshot base.

Constraint: PR #1107 targets the perception-snapshot branch after that branch learned richer DOM capture options.
Rejected: Dropping tiled/iframe options or the OmniParser provider | both are opt-in surfaces and do not need to conflict.
Confidence: medium
Scope-risk: moderate
Directive: Keep OmniParser HTTP opt-in and preserve legacy DOM output as the default vision_find behavior.
Tested: npx jest tests/vision/omniparser-http-provider.test.ts tests/vision/vision-find.test.ts tests/vision/perception-snapshot.test.ts --runInBand --forceExit; npm ci; npm run build; git diff --check for touched file.
Not-tested: Full GitHub Actions matrix.
Co-authored-by: OmX <omx@oh-my-codex.dev>
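A minimal sketch of what "opt-in, DOM by default" could look like when resolving configuration. The environment variable names come from this PR; everything else is an assumption for illustration, including the function name `resolveVisionConfig`, the default values, and the choice to treat a missing URL as "not opted in".

```typescript
type VisionProvider = 'dom' | 'omniparser-http';

interface VisionConfig {
  provider: VisionProvider;
  url?: string;
  timeoutMs: number;
  maxElements: number;
}

// Hypothetical defaults; the PR does not state the real values.
const DEFAULT_TIMEOUT_MS = 5000;
const DEFAULT_MAX_ELEMENTS = 200;

// Parse a positive integer, falling back when the value is missing or invalid.
function positiveInt(raw: string | undefined, fallback: number): number {
  const n = parseInt(raw ?? '', 10);
  return n > 0 ? n : fallback; // NaN > 0 is false, so bad input falls back
}

function resolveVisionConfig(
  env: Record<string, string | undefined>,
  format: string,
): VisionConfig {
  const limits = {
    timeoutMs: positiveInt(env['OPENCHROME_OMNIPARSER_TIMEOUT_MS'], DEFAULT_TIMEOUT_MS),
    maxElements: positiveInt(env['OPENCHROME_OMNIPARSER_MAX_ELEMENTS'], DEFAULT_MAX_ELEMENTS),
  };
  const url = env['OPENCHROME_OMNIPARSER_URL'];
  // The external provider is used only when explicitly selected, snapshot
  // output is requested, and a service URL is configured (assumption).
  const optedIn =
    env['OPENCHROME_VISION_PROVIDER'] === 'omniparser-http' &&
    format === 'snapshot' &&
    url !== undefined &&
    url !== '';
  return optedIn
    ? { provider: 'omniparser-http', url, ...limits }
    : { provider: 'dom', ...limits };
}
```

With an empty environment this resolves to the `dom` provider, which matches the directive that legacy DOM output stays the default `vision_find` behavior.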
Merge rationale (stack consolidation)
Intent. Implements issue #1053 as a stacked follow-up on the perception-snapshot stack (#1052/#1093): an opt-in HTTP adapter for an already-running OmniParser-compatible service.
Why this is correct.
CI. This PR targets the perception-snapshot feature branch. Merging into the feature-branch base (stack consolidation).
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
feat/1053-omniparser-http → feat/1052-perception-snapshot
a02b5d7: Compose OmniParser snapshots with current vision capture options
Owner comment cleanup: 0 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Summary
Implements #1053 as a stacked follow-up on #1093 / #1052:
- Adds `OmniParserHttpProvider`, an optional HTTP adapter for an already-running OmniParser-compatible service
- Enabled only when `OPENCHROME_VISION_PROVIDER=omniparser-http` and snapshot output is requested
- Configured via `OPENCHROME_OMNIPARSER_URL`, `OPENCHROME_OMNIPARSER_TIMEOUT_MS`, and `OPENCHROME_OMNIPARSER_MAX_ELEMENTS`
- Normalizes `parsed_content_list` ratio/pixel bboxes into `PerceptionSnapshot` elements

Stack / duplicate check
- Targets `feat/1052-perception-snapshot` because this depends on the provider-neutral snapshot contract from #1093 (feat(vision): add provider-neutral perception snapshots (#1052)).
- `vision_find`; this PR adds only an optional external provider adapter.
- `package.json`.

Verification
Passed:
- `npm test -- --runInBand tests/vision/omniparser-http-provider.test.ts`
- `npm test -- --runInBand tests/vision/perception-snapshot.test.ts tests/vision/omniparser-http-provider.test.ts`
- `npm run build`
- `npm run lint:tier`
- `npm run lint:changed`
- `git diff --check`

Baseline notes:
- `npm run lint` still fails on unrelated existing errors:
  - `src/session-manager.ts:959:7` `no-useless-catch`
  - `src/tools/connect.ts:43:69` `@typescript-eslint/ban-types`
- `npm test` was not rerun here because the preceding #1093 worktree (feat(vision): add provider-neutral perception snapshots (#1052)) already exposed unrelated existing failures in `tests/tools/tabs.test.ts`.

Merge validation path via OpenChrome
After #1093 lands, validate this PR with an OmniParser-compatible mock service:
1. Set `OPENCHROME_VISION_PROVIDER=omniparser-http` and `OPENCHROME_OMNIPARSER_URL=http://127.0.0.1:<mock-port>/parse/`.
2. Call `vision_find` with `{ "format": "snapshot", "includeImage": false }`.
3. Confirm the output has `provider: "omniparser-http"`, `source: "omniparser-http"` elements, bounded labels, and normalized `bboxRatio`.
4. Confirm `omniparser-http` does not appear in default snapshot output.

Closes #1053.
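For reviewers checking the "bounded labels" and "normalized bboxRatio" expectations in the validation path, here is a rough sketch of how ratio or pixel boxes from `parsed_content_list` might be mapped into snapshot elements. It is not the implementation in this PR: the `ParsedItem` and `BBoxRatio` shapes, the `[x1, y1, x2, y2]` box order, the ratio-versus-pixel heuristic, and all function names are assumptions made for illustration.

```typescript
// Assumed box layout: [x1, y1, x2, y2], either as 0..1 ratios or as pixels.
type RawBBox = [number, number, number, number];

// Hypothetical input item; the real service response may carry more fields.
interface ParsedItem {
  content: string;
  bbox: RawBBox;
}

// Hypothetical output shape for a normalized box.
interface BBoxRatio {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface SnapshotElement {
  label: string;
  source: 'omniparser-http';
  bboxRatio: BBoxRatio;
}

function clamp01(v: number): number {
  return Math.min(1, Math.max(0, v));
}

function toBBoxRatio(bbox: RawBBox, imageWidth: number, imageHeight: number): BBoxRatio {
  // Heuristic (assumption): if every value is within 0..1, treat the box as
  // ratios; otherwise treat it as pixels. A pixel box inside the top-left
  // pixel is ambiguous and would be read as a ratio box.
  const isRatio = bbox.every((v) => v >= 0 && v <= 1);
  const sx = isRatio ? 1 : imageWidth;
  const sy = isRatio ? 1 : imageHeight;
  const left = clamp01(Math.min(bbox[0], bbox[2]) / sx);
  const right = clamp01(Math.max(bbox[0], bbox[2]) / sx);
  const top = clamp01(Math.min(bbox[1], bbox[3]) / sy);
  const bottom = clamp01(Math.max(bbox[1], bbox[3]) / sy);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

function toElements(
  items: ParsedItem[],
  imageWidth: number,
  imageHeight: number,
  maxElements: number,
  maxLabelChars: number,
): SnapshotElement[] {
  // Cap the element count, then bound each label and normalize each box.
  return items.slice(0, maxElements).map((item) => ({
    label: item.content.trim().slice(0, maxLabelChars),
    source: 'omniparser-http' as const,
    bboxRatio: toBBoxRatio(item.bbox, imageWidth, imageHeight),
  }));
}
```

A mock service for step 1 only needs to return a handful of items in both box styles; after the mapping, every `bboxRatio` value should fall within 0 and 1 regardless of which style the mock used.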